1. INTRODUCTION


Until recently, research and development efforts in the database management systems area were focused on
supporting traditional business applications. The design of database systems capable of supporting non-traditional
application areas, including engineering applications for CAD/CAM and VLSI data, scientific and statistical
applications, expert database systems (which can really be viewed as adding database system technology to an AI
application), and image/voice applications, has now emerged as an important new direction for database system
research. These new applications differ from more conventional applications like transaction processing and record
keeping in a number of important areas:


(1) Data modeling requirements. Each new application area requires a different set of data modeling tools.
    The types of entities and relationships that must be described for a VLSI circuit design are quite different
    from the data modeling requirements of a banking application.

(2) Processing functionality. Each new application area has a specialized set of operations that must be
    supported by the database system. Consider, for example, a database containing satellite images. It makes
    little sense to talk about doing joins between these images or even components of a single image. Instead, we
    are more likely to be interested in analyzing the image using specialized image processing and/or recognition
    algorithms. For example, if we are looking for crop diseases we will want to apply an algorithm that examines
    the images for the crop disease signatures. As another example, if we wanted to look for particular types
    of ships, we would be interested in using an image recognition algorithm to compare the satellite images with
    those of the relevant types of ships. As we will discuss in more detail below, we contend that it does not
    make sense to implement such algorithms in terms of relational database operations (or CODASYL or
    network data model primitives either).

(3) Concurrency control and recovery mechanisms. Each new application area also has slightly different
    requirements for concurrency control and recovery mechanisms. While locking and logging are the accepted
    mechanisms for conventional database applications, a versioning mechanism looks most appropriate for
    engineering applications. For image databases, which tend to be principally accessed in a read-only fashion,
    perhaps no concurrency control or recovery mechanism is needed.

(4) Access methods and storage structures. Each new application area also has dramatically different
    requirements for access methods and storage structures. Access and manipulation of VLSI databases is
    facilitated by new access methods such as R-Trees. Storage of image data is greatly simplified if the database
    system supports large multidimensional arrays as a basic data type (a capability provided by no commercial
    database system at this time). Storing such images as tuples in a relational database system is generally
    either impossible or terribly inefficient.

The EXODUS project at the University of Wisconsin [Care85, Care86a, Care86b, Grae87, Rich87] is
addressing the problems posed in these emerging applications by providing tools that will enable the rapid
implementation of high-performance, application-specific database systems. EXODUS provides a set of kernel
facilities for use across all applications, such as a versatile storage manager and a general-purpose manager for
type-related dependency information. In addition, EXODUS provides a set of tools to help the database implementor
(DBI) to develop new database system software. The implementation of some DBMS components is supported by tools
which actually generate the components from specifications; for example, tools are provided to generate a query
optimizer from a rule-based description of a data model, its operators, and their implementations. Other
components, such as new abstract data types, access methods, and database operations, must be explicitly coded by
the DBI due to their more widely-varying and highly algorithmic nature.¹ EXODUS attempts to simplify this aspect of
the DBI's job by providing a set of high-leverage programming language constructs for the DBI to use in writing the
code for these components.

In Section 2, we describe the architecture of EXODUS in more detail. The current status of the project is
presented in Section 3 along with a list of the technical accomplishments of the project. A listing of the technical
reports produced as part of the project is contained in Section 4.

2. AN OVERVIEW OF THE EXODUS ARCHITECTURE

Since  EXODUS  is  basically  a  collection  of  components  and  tools  that  can  be  used  in  a  number  of  different 
ways,  describing  EXODUS  is  more  difficult  than  describing  the  structure/organization  of  other  extensible  database 
system  designs  that  have  appeared  recently.  We  believe  that  the  flexibility  provided  by  the  EXODUS  approach  will 
make  the  system  usable  for  a  much  wider  variety  of  applications  (as  we  will  discuss  later). 

The fixed components of EXODUS include the EXODUS Storage Object Manager, for managing persistent
objects, and a generalized Dependency Manager (formerly called the Type Manager), for keeping track of
information about various type-related dependencies. In addition to these fixed components, EXODUS also provides
tools to aid the DBI in the construction of application-specific database systems. One such tool is the E programming
language, which is provided for developing new database software components. A related resource is a
type-independent module library; E's generator classes and iterators can be used to produce useful modules (e.g.,
various access methods) that are independent of the types of the objects on which they operate, and these modules
can then be saved away for future use. Another class of tools is provided for generating components from a
specification. An example is the EXODUS rule-based Query Optimizer Generator. We also envision providing
similar generator tools to aid in the construction of the front-end portions of an application-specific DBMS. The
components of EXODUS are described further in the following sections. More detail on the Storage Object
Manager can also be found in [Care86a], the E programming language is described in [Rich87], and details
regarding the Query Optimizer Generator and an initial evaluation of its performance can be found in [Grae87].

¹Actually, EXODUS will provide a library of generally useful components, such as widely-applicable access methods
including B+ trees and some form of dynamic hashing, but the DBI must implement components that are not available
in the library.

2.1.  The  Storage  Object  Manager 

The Storage Object Manager provides storage objects for storing data and files for logically and physically
grouping storage objects together. Also provided are a powerful buffer manager that buffers variable-length pieces
of large storage objects, primitives for managing versions of storage objects, and concurrency control and recovery
services for operations on storage objects and files.

A storage object is an uninterpreted container of bytes which can be as small (e.g., a few bytes) or as large
(e.g., hundreds of megabytes) as demanded by an application. The distinction between small and large storage
objects is hidden from higher layers of EXODUS software. Small storage objects reside within a single disk page,
whereas large storage objects occupy potentially many disk pages. In either case, the object identifier (OID) of a
storage object is an address of the form (page #, slot #). The OID of a small storage object points to the
object on disk; for a large storage object, the OID points to its large object header. A large object header can
reside on a slotted page with other large object headers and small storage objects, and it contains pointers to other
pages involved in the representation of the large object. Pages in a large storage object are private to that object
(although pages are shared between versions of a large storage object). When a small storage object grows to the
point where it can no longer be accommodated on a single page, the Storage Object Manager will automatically
convert it into a large storage object, leaving its header in place of the original small object.

All read requests specify an OID and a range of bytes; the desired range of bytes is read into a contiguous
region in the buffer pool (even if the bytes are distributed over several partially full pages on disk), and a pointer to
the bytes is returned to the caller. Bytes may be overwritten directly, using this pointer, and a call is provided to tell
the Storage Object Manager that a subrange of the bytes that were read has been modified (information needed for
recovery to take place). For shrinking/growing storage objects, calls to insert bytes into and delete bytes from a
specified offset within a storage object are provided, as is a call to append bytes to the end of an object. To make
these operations efficient, large storage objects are represented using a B+ tree structure to index data pages on byte
offset.
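
The byte-offset indexing described above can be illustrated with a small sketch. It is written in Python rather than E or C, and it is not the EXODUS implementation: the B+ tree is flattened to a single level of leaf pages located through cumulative byte counts, and the names LargeObject, read, and insert, as well as the 4-byte page size, are assumptions made only for this example.

```python
from bisect import bisect_right
from itertools import accumulate

PAGE_SIZE = 4  # hypothetical; tiny so that page splits are easy to see


class LargeObject:
    """Leaf pages of a large object, located by cumulative byte counts."""

    def __init__(self, data: bytes = b""):
        self.pages = [data[i:i + PAGE_SIZE] for i in range(0, len(data), PAGE_SIZE)]

    def _locate(self, offset: int):
        """Return (page index, offset within that page) for a byte offset."""
        ends = list(accumulate(len(p) for p in self.pages))
        i = bisect_right(ends, offset)
        start = ends[i - 1] if i > 0 else 0
        return i, offset - start

    def read(self, offset: int, length: int) -> bytes:
        """Gather a byte range into one contiguous buffer."""
        out = bytearray()
        i, within = self._locate(offset)
        while length > 0 and i < len(self.pages):
            chunk = self.pages[i][within:within + length]
            out += chunk
            length -= len(chunk)
            i, within = i + 1, 0
        return bytes(out)

    def insert(self, offset: int, data: bytes) -> None:
        """Insert bytes at an offset, rewriting only the affected page."""
        i, within = self._locate(offset)
        if i == len(self.pages):  # inserting at the very end (append)
            self.pages.append(b"")
        merged = self.pages[i][:within] + data + self.pages[i][within:]
        # Re-split the one modified page; all other pages are untouched.
        self.pages[i:i + 1] = [merged[j:j + PAGE_SIZE]
                               for j in range(0, len(merged), PAGE_SIZE)]
```

Because pages are found through byte counts rather than fixed positions, an insertion in the middle of the object leaves partially full pages behind instead of shifting every later byte, which is the property the tree-structured representation is meant to provide.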

The Storage Object Manager also provides support for versions of storage objects. In the case of small
storage objects, versioning is implemented by making a copy of the entire object before applying the update.
Versions of large storage objects are maintained by copying and updating only those pages that differ from version to
version. The Storage Object Manager also supports the deletion of a version with respect to a set of other versions
with which it may share pages. The reason for only providing a primitive level of version support is that different
EXODUS applications may have widely different notions of how versions should be supported. We do not omit
version management altogether for efficiency reasons: it would be prohibitively expensive, both in terms of
storage space and I/O cost, if clients were required to maintain versions of large objects externally by making entire
copies.
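
The page sharing between versions of a large object can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Storage Object Manager's actual data structures: a dictionary stands in for disk pages, each version is a list of page identifiers, and the names VersionedObject, derive, read, and delete are hypothetical.

```python
class VersionedObject:
    """Versions of a large object that share every page they have in common."""

    def __init__(self, pages):
        self.store = {}        # page id -> bytes; stands in for disk pages
        self.next_pid = 0
        self.next_vid = 1
        self.versions = {0: [self._alloc(p) for p in pages]}

    def _alloc(self, content: bytes) -> int:
        pid = self.next_pid
        self.next_pid += 1
        self.store[pid] = content
        return pid

    def derive(self, base: int, page_no: int, new_content: bytes) -> int:
        """Create a new version from `base`, copying only the changed page."""
        ids = list(self.versions[base])       # copies page ids, not pages
        ids[page_no] = self._alloc(new_content)
        vid = self.next_vid
        self.next_vid += 1
        self.versions[vid] = ids
        return vid

    def read(self, version: int) -> bytes:
        return b"".join(self.store[pid] for pid in self.versions[version])

    def delete(self, version: int) -> None:
        """Delete a version, freeing only pages no other version still uses."""
        doomed = set(self.versions.pop(version))
        still_used = {pid for ids in self.versions.values() for pid in ids}
        for pid in doomed - still_used:
            del self.store[pid]
```

In this sketch, deriving a version of a three-page object allocates one new page rather than three, and deleting the old version frees only the single page that the surviving version does not reference.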

For concurrency control, two-phase locking of byte ranges within storage objects is used, with a "lock entire
object" option being provided for cases where object-level locking will suffice. To ensure the integrity of the
internal pages of large storage objects during insert, append, and delete operations (e.g., while their counts and
pointers are being changed), non-two-phase B+ tree locking protocols are employed. For recovery, small storage
objects are handled by logging changed bytes and performing updates in place at the object level. Recovery for large
storage objects is handled using a combination of shadowing and logging: updated internal pages and leaf blocks are
shadowed up to the root level, with updates being installed atomically by overwriting the old object header with the
new one. A similar scheme is used for versioned objects, but the before-image of the updated large object header
(or entire small object) is retained as an old version of the object.

Finally,  the  Storage  Object  Manager  provides  the  notion  of  a  file  object.  A  file  object  is  an  unordered  set  of 
related  storage  objects,  and  is  useful  in  several  different  ways.  First,  the  Storage  Object  Manager  provides  a 
mechanism  for  sequencing  through  all  of  the  objects  in  a  file,  so  related  objects  can  be  placed  in  a  common  file  for 
sequential  scanning  purposes.  Second,  objects  within  a  given  file  are  placed  on  disk  pages  allocated  to  the  file,  so 
file  objects  provide  support  for  objects  that  need  to  be  co-located  on  disk. 
2.2. The Dependency Manager


The EXODUS Dependency Manager is a repository for information related to persistent types.² It maintains
information about all of the pieces (called fragments) that make up a compiled query, including type definitions and
other E code, and about their relationships to one another. It also keeps track of the relationship between files and
their types (by treating files in a manner similar to fragments). In short, the Dependency Manager keeps track of
dependencies between types and almost everything else that is related to or dependent upon such information.

More specifically, certain time ordering constraints must hold between the fragments constituting a complete
query. For example, a compiled query plan must have been created more recently than the program text for any of
the types or database operations that it employs, as otherwise out-of-date code will have been used in its creation. A
given abstract data type, or a set of operations, is also likely to have multiple representations (e.g., E source code, an
intermediate representation, and a linkable object file), and similar time ordering constraints must hold between
these representations. The Dependency Manager's role is thus similar to the Unix™ make facility [Feld79].
Unlike make, which only examines dependencies and timestamps when it is started up, the Dependency Manager
maintains a graph of inter-fragment dependencies at all times (and updates it incrementally).
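
The time ordering constraint can be made concrete with a short sketch. It is written in Python and is only an illustration of a make-style staleness check over an acyclic dependency graph; the function name out_of_date, the dictionary representation, and the fragment names used with it are assumptions, not part of EXODUS.

```python
def out_of_date(depends_on, timestamp):
    """Return the fragments that are older than something they depend on
    (directly or transitively), in an order in which they can be rebuilt.

    depends_on: fragment -> list of fragments it depends on (acyclic)
    timestamp:  fragment -> time the fragment was last created or modified
    """
    effective = {}   # time each fragment would carry once brought up to date
    order = []

    def visit(frag):
        if frag in effective:
            return effective[frag]
        newest_dep = max((visit(d) for d in depends_on.get(frag, [])),
                         default=None)
        t = timestamp[frag]
        if newest_dep is not None and newest_dep > t:
            order.append(frag)   # prerequisites were appended before this
            t = newest_dep       # a rebuild makes it at least this recent
        effective[frag] = t
        return t

    for frag in timestamp:
        visit(frag)
    return order
```

For example, if the E source of a type was modified after its object code and a query plan were produced, both the object code and the plan are reported, with the object code first so that it can be regenerated before the plan.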

The Dependency Manager also plays a role in maintaining data abstraction that distinguishes it from make.
In particular, a given type used by a query plan is likely to use other types to constitute its internal representation.
Strictly speaking, the first type is not dependent upon the linkable object code of its constituent types' operations;
that is, while it must eventually be linked with their code, it is not necessary that their object code be up to date, or
even compiled, until link time. We call fragments of this sort companions; make has no facilities for specifying
and using companions. The Dependency Manager requires such a facility, as otherwise it would be unable to
provide a complete list of the objects constituting the compiled access plan for a query, which is necessary when a
query is to be linked.

The Dependency Manager maintains the correct time ordering of fragments via two mechanisms, rules and
actions. The set of fragments constitutes the nodes of an acyclic directed graph; rules generate the arcs of this
graph. When a fragment is found to be older than those fragments upon which it depends (with the dependencies
being determined from the rules), a search is made for an appropriate action that can be performed to bring the
fragment up to date. Both rules and actions are defined using a syntax based on regular expressions to allow a wide
range of default dependencies to be specified conveniently.

²As we will explain shortly, new types in EXODUS are defined using the class and dbclass constructs of the E
programming language.
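
How pattern-based rules can generate the arcs of a dependency graph is sketched below in Python. The rule syntax of the Dependency Manager is not given here, so everything in this example is an assumption: the rule format (a pattern for the dependent fragment paired with a template for its prerequisite), the fragment naming scheme with .e, .o, and .plan suffixes, and the function name arcs.

```python
import re

# Hypothetical rules: (pattern for a fragment, template for what it depends on).
RULES = [
    (r"(.+)\.o$", r"\1.e"),       # object code depends on its E source
    (r"(.+)\.plan$", r"\1.o"),    # a compiled plan depends on object code
]


def arcs(fragments, rules=RULES):
    """Generate dependency arcs (fragment, prerequisite) from pattern rules."""
    names = set(fragments)
    result = []
    for frag in sorted(names):
        for pattern, template in rules:
            m = re.match(pattern, frag)
            if m:
                prereq = m.expand(template)   # substitute the captured stem
                if prereq in names:           # only link fragments that exist
                    result.append((frag, prereq))
    return result
```

Two rules are enough here to cover every fragment that follows the naming convention, which is the kind of default dependency that a pattern-based syntax makes convenient to state.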

2.3. The E Programming Language

A major tool provided by EXODUS is the E programming language and its compiler. E is an extension of C++
[Stro86] that aids the DBI in a number of problem areas related to database system programming, including
interaction with persistent storage, accommodation of missing type (class) information, and query compilation. E is
designed to be upward compatible with C++, and its extensions include both new language features and a number of
predefined classes.

E was designed with the following database system architecture in mind: First, all access methods, data
model operators, and utility functions are written in E. In addition to these modules, the Storage Object Manager,
and the Dependency Manager, the database system includes the E compiler itself. At run time, database schema
definitions (e.g., create relation commands) and queries are first translated into E programs and then compiled. One
result of this architecture is a system in which the "impedance mismatch" [Cope84] between type systems
disappears. Another is that the system is easy to extend. For example, the DBI may add a new data type by
implementing it as an E class, storing its definition and implementation in files, and registering the resulting module
with the Dependency Manager for later use.

The following paragraphs describe some of the more important features of E from the standpoint of the DBI.
More details can be found in [Rich87].

2.3.1. Generator Classes for Unknown Types

One  of  the  problems  faced  by  the  DBI  is  that  many  of  the  types  involved  in  database  processing  are  not 
known  until  well  after  the  code  needing  those  types  is  written.  For  example,  the  code  implementing  a  hash-join 
algorithm  does  not  know  what  types  of  entities  it  will  have  to  join.  Similarly,  index  code  does  not  know  what  types 
of  keys  it  will  contain  nor  what  type  of  entities  it  will  index. 

To address this problem, E augments C++ with generator classes, which are very similar to the parameterized
clusters of CLU [Lisk77]. Such a class is parameterized in terms of one or more unknown types; within the class
definition, these (formal) type names are used freely as regular type names. This mechanism allows one to define,
for example, a class of the form stack[ T ] where the specific type (class) T of the stack elements is not
known. The user of such a class must instantiate it by providing specific parameters to the class; e.g., one may
declare x to be an integer stack via the declaration stack[ int ] x. Similarly, the DBI can define the type
of a B+ tree node as a class in which both the key type and the type of entity being indexed are class parameters.
Later, when the user builds an index over employees on social security number, the system generates and compiles a
small E fragment which instantiates BTnode[ SSN_type, EMP_type ]. Such instantiation can be
efficiently accomplished via a linking process [Atki78].
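
For readers more familiar with present-day languages, the idea of a class parameterized by an unknown type can be approximated as in the following Python sketch. It is only an analogy: Python's typing.Generic records the type parameter for static checkers and does not generate or link code per instantiation as E does, and the names Stack and IntStack are chosen for this example.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")  # the formal type parameter, like T in stack[ T ]


class Stack(Generic[T]):
    """A stack whose element type T is a parameter of the class."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

    def empty(self) -> bool:
        return not self._items


# Supplying an actual parameter, roughly analogous to: stack[ int ] x
IntStack = Stack[int]
```

The body of Stack is written once in terms of T, and each use supplies the actual element type, which is the same division of labor that lets a B+ tree node be written before the key and entity types are known.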

2.3.2. Class fileof[ T ] for Persistent Storage

Another problem in database system programming is that most file systems provide the DBI only with
untyped storage. Thus, after being read from disk, all data must be explicitly type cast in the DBI's code before it
can be operated upon. In addition, since the data resides on secondary storage, the DBI must include explicit calls
to the buffer manager in order to use it. These factors increase the amount of code that the DBI must write, and they
also provide increased opportunities for coding errors.

E's answer to this problem is the "built-in" generator class fileof[ T ] where T must be a dbclass.
A dbclass is declared in the same way as a C++ class with the restriction that a dbclass may contain only other
dbclasses. (Predefined dbclasses exist for the fundamental types int, float, char, etc.) Dbclasses were introduced so
that the compiler can always distinguish between objects residing only on the heap and those that generally reside
on disk (but may also reside in memory), since the implementation of the two is very different.

Instances of the fileof generator class are implemented as a descriptor (in memory) associated with a
physical file (on disk). This implementation is hidden behind an operational interface that allows the user to bind
typed pointers to objects in a file, to create and destroy objects in a file, etc. For example, the following function
returns the sum of all the integers in a file of integers. (The file is passed by reference.)

    int filesum( fileof[dbint]& f )
    {
        dbint *p;    /* dbint is the predefined dbclass for int */
        int sum = 0;

        for( p = f.getfirst(); p != 0; p = f.getnext( p ) ) sum += *p;
        return sum;
    }

Although this example is extremely simple, it illustrates the two features mentioned above. The first is that no
casting is needed to use the integer pointer p; the second is that no buffer calls are necessary to access the objects in
file f. Clearly, an important research direction related to the implementation of E is the optimization of the calls to
the buffer manager generated by the E compiler (especially for files containing very large objects such as images).

2.3.3. Iterators for Scans and Query Processing

A typical approach for structuring a database system is to include a layer which provides scans over objects in
the database. A scan is a control abstraction which provides a state-saving interface to the "memoryless" storage
system calls; this interface is needed for the record-at-a-time processing done in higher layers. A typical
implementation of scans will allocate a data structure, called a scan descriptor, to save all needed state between
calls; it is up to the user to pass the descriptor with every call.

The control abstraction of a scan is provided in EXODUS via the notion of an iterator [Lisk77, OBri86]. An
iterator is a coroutine-like function that saves its data and control states between calls; each time the iterator
produces (yields) a new value, it is suspended until resumed by the client. Thus, no matter how complicated the
iterator may be, the client only sees a steady stream of values being produced. Finally, for implementation reasons,
the client can only invoke an iterator within a new kind of structured statement, the iterate loop (which
generalizes the for ... in loop of CLU).

The general idea for implementing scans should now be clear. For example, to implement a scan over B+
trees, we would write an iterator function which takes a B+ tree, a lower bound, and an upper bound as arguments.
It would begin by searching down to the leaf level of the tree for the lower bound, keeping a stack of node pointers
along the way. It would then walk the tree, yielding object references one at a time, until the upper bound is
reached. At that point, the iterator would terminate.
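
The B+ tree scan just described can be sketched with a Python generator, which behaves much like a CLU-style iterator in that it suspends at each yield and resumes on demand. This is an illustration rather than E code: the Node layout (separator keys with one more child than keys in internal nodes, parallel key and value lists in leaves) and the names Node and range_scan are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right


class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # separators (internal) or entry keys (leaf)
        self.children = children  # list of Node for internal nodes, else None
        self.values = values      # object references for leaves, else None


def range_scan(root, low, high):
    """Yield (key, value) pairs with low <= key <= high, in key order."""
    # Descend to the leaf that may hold `low`, stacking (node, child index).
    stack, node = [], root
    while node.children is not None:
        i = bisect_right(node.keys, low)
        stack.append((node, i))
        node = node.children[i]
    pos = bisect_left(node.keys, low)
    while True:
        # Emit entries from the current leaf until the upper bound is passed.
        while pos < len(node.keys):
            if node.keys[pos] > high:
                return
            yield node.keys[pos], node.values[pos]
            pos += 1
        # Leaf exhausted: climb until some ancestor has a next child.
        while stack and stack[-1][1] + 1 >= len(stack[-1][0].children):
            stack.pop()
        if not stack:
            return
        parent, i = stack.pop()
        stack.append((parent, i + 1))
        node = parent.children[i + 1]
        # Descend along leftmost children to the next leaf.
        while node.children is not None:
            stack.append((node, 0))
            node = node.children[0]
        pos = 0
```

All scan state (the stack of nodes and the position within the current leaf) lives in the suspended generator, so the client never has to allocate or pass a scan descriptor.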

Iterators are also used to piece executable queries together from a parse tree. If we consider a query to be a
pipeline of processing filters, then each stage can be implemented as an iterator which is a client of one or more
iterators (upstream in the pipe) and which yields its results to the next stage (downstream in the pipe). Execution of
the pipeline will be demand-driven in nature. For example, the DBI for a relational DBMS would write code for
select, project, and join as iterators implementing filters. Given the parse tree of a user query, it is a fairly simple
task to produce E code that implements the pipeline.
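
A demand-driven pipeline of this kind can be sketched with Python generators standing in for E iterators. The operator names, the representation of rows as dictionaries, and the choice of a nested loops join are assumptions made for illustration only.

```python
def scan(relation):
    """Produce the rows of a stored relation one at a time."""
    for row in relation:
        yield row


def select(source, predicate):
    """Pass along only the rows that satisfy the predicate."""
    for row in source:
        if predicate(row):
            yield row


def project(source, columns):
    """Keep only the named columns of each row."""
    for row in source:
        yield {c: row[c] for c in columns}


def nested_loops_join(outer, inner_relation, on):
    """Join each outer row with the matching rows of the inner relation."""
    for o in outer:
        for i in scan(inner_relation):   # inner input is rescanned per outer row
            if o[on] == i[on]:
                yield {**o, **i}
```

A plan is then just a nesting of calls, for example project(nested_loops_join(select(scan(emp), pred), dept, "dept"), ["name", "dname"]). Nothing executes until the consumer asks the outermost stage for a row, and each request pulls only as many rows through the upstream stages as are needed to produce it.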
2.4. Type-Independent Access Methods and Operator Methods

Layered above the Storage Object Manager is a collection of access methods that provide associative access
to files of storage objects and further support for versioning (if desired). For access methods, EXODUS will
provide a library of type-independent index structures including B+ trees, Grid files [Niev84], and linear hashing
[Litw80]. These access methods will be implemented using the class generator and iterator capabilities provided by
the E programming language. This capability enables existing access methods to be used with DBI-defined abstract
data types without modification, as long as the capabilities provided by the data type satisfy the requirements of
the access methods. In addition, a DBI may wish to implement new types of access methods in the process of
developing an application-specific database system. EXODUS provides mechanisms to greatly simplify this task.
First, since new access methods are written in E, the DBI is shielded from having to map main memory data
structures onto storage objects and from having to write code to deal with buffering. E will also simplify the task of
handling concurrency control and recovery for new access methods.

Layered above the access methods is a set of operator methods that implement the operations of the
application's chosen data model. As for access methods, the class generator and iterator facilities of E facilitate the
development of operator methods. Generally useful methods (e.g., selection) will be made available in a
type-independent library; methods specific to a given application domain will have to be developed by the DBI.

2.5. The Rule-Based Query Optimizer Generator

Since we expect that EXODUS will be used for a wide variety of applications, each with a potentially
different query language, it is not possible for EXODUS to furnish a single generic query language, and it is
accordingly impossible for a single query optimizer to suffice for all applications. As an alternative, a generator for
producing query optimizers for algebraic query languages has been implemented. The input to the query optimizer
generator is a collection of rules regarding the operators of the target query language, the transformations that can
be legally applied to these operators (e.g., pushing selections before joins), and a description of the methods that can
be used to execute each operator in the query language (including their costs and side effects). The Query
Optimizer Generator transforms these description files into C source code,³ producing an optimizer for the
application's query language. Later, to optimize queries using the resulting optimizer, a query is first parsed and
converted into its initial form as a tree of operators; it is then transformed by the generated optimizer into an
optimized execution plan expressed as a tree of methods. During the process of optimizing a query, the optimizer
avoids exhaustive search by using AI search techniques and employing past (learned) experience to direct the search.
As described above, each method in the tree produced by the optimizer is implemented as an iterator generator in E.
Thus, a post-optimization pass over the plan tree is made to produce E code corresponding to the plan. For queries
involving more than one operator, the iterators are nested in a manner that allows the query to be processed in a
pipelined fashion, as mentioned earlier.

³Note: While E is the language that the DBI will use to implement a DBMS, we are implementing the various
components of EXODUS in C.
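
To make the notion of a transformation rule on an operator tree concrete, the following Python sketch applies one rule, pushing a selection below a join. It is deliberately much simpler than the generated optimizers described above: there is no cost model, no choice of methods, and no guided search. The tuple encoding of operator trees and the names columns, push_select_below_join, and optimize are assumptions made for this example.

```python
# Operator trees as tuples (a hypothetical encoding):
#   ("get", relation_name, column_names)
#   ("select", predicate_columns, child)
#   ("join", left, right)

def columns(tree):
    """Return the set of columns produced by an operator tree."""
    op = tree[0]
    if op == "get":
        return set(tree[2])
    if op == "select":
        return columns(tree[2])
    if op == "join":
        return columns(tree[1]) | columns(tree[2])
    raise ValueError(op)


def push_select_below_join(tree):
    """Rule: select(join(L, R)) -> join(select(L), R) when the predicate
    uses only columns of L (and symmetrically for R)."""
    if tree[0] == "select" and tree[2][0] == "join":
        _, pred_cols, (_, left, right) = tree
        if set(pred_cols) <= columns(left):
            return ("join", ("select", pred_cols, left), right)
        if set(pred_cols) <= columns(right):
            return ("join", left, ("select", pred_cols, right))
    return None  # the rule does not apply to this node


def optimize(tree, rules):
    """Apply rules top-down until none applies anywhere in the tree."""
    for rule in rules:
        new = rule(tree)
        if new is not None:
            return optimize(new, rules)
    if tree[0] == "select":
        return ("select", tree[1], optimize(tree[2], rules))
    if tree[0] == "join":
        return ("join", optimize(tree[1], rules), optimize(tree[2], rules))
    return tree
```

Because the rules are passed in as data, a different query algebra would supply a different rule list without changing optimize, which mirrors in miniature the separation between the optimizer generator and its rule description files.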

2.6.  Application-Specific  DBMS  Development 

Figure 1 presents a sketch of the architecture of a functionally complete, application-specific database system
implemented using EXODUS. The components in Figure 1 that are implemented by the DBI in E are the access
methods and operator methods. As discussed above, EXODUS provides a library of type-independent access
methods, so it might not be necessary for a DBI to actually implement any access methods. EXODUS will also
provide a library of methods for a number of operators that operate on a single type of storage object (e.g.,
selection), but it will not provide application or data model specific methods. For example, since a method for
examining objects containing satellite image data for the signature of a particular crop disease would not be useful in
general, it does not belong in such a library. In general, the DBI will need to implement (using E) one or more
methods for each operator in the query language associated with the target application. The DBI must also write
code (i.e., dbclass member functions) for the operations associated with each new abstract data type that he or she
wishes to define.

To clarify by using a familiar example, a DBI who wanted to implement a relational DBMS for business
applications via the EXODUS approach would have to obtain code for the desired access methods (e.g., B+ trees
and linear hashing) by extracting existing code from the library and/or by writing the desired code from scratch in E.
Similarly, code must be obtained for the operator methods (e.g., relation scan, indexed selection, nested loops join,
merge join, etc.) and for various useful types (e.g., date and money). A DBI implementing a database management
system for an image application would have to implement an analogous set of routines, presumably including
various spatial index structures, operations that manipulate collections of images, and an appropriate set of types. As
discussed earlier, E is provided to greatly simplify these programming tasks.
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Figure  1:  An  EXODUS-Based  DBMS. 

Finally, the top level of the EXODUS architecture consists of a set of components that are generated from
DBI specifications. One such component is the query optimizer and compiler. We also plan to investigate and
develop tools to automate the process of producing new DML/DDL components, which are the query parser and
DDL support components shown in Figure 1. (This idea is similar to the data model compiler notion of [Mary86].)
DML components generate operator trees to be fed to the query optimizer, while DDL components produce
compiled E code; that is, user-level schema definitions result in the definition of associated E types (which are stored
away and registered with the Dependency Manager) and E code to create the associated EXODUS files.

Note that it is also possible to use E as a lower-level mechanism for accessing a database directly, for
applications needing such low-level access. Assume that one has used the tools provided by EXODUS to construct an
application-specific database system. "Normal" accesses to the database would be processed through its ad-hoc or
embedded query interfaces, while those applications needing direct access to storage objects would be developed
using E. Since schema information for all storage objects is maintained internally in E form, the application
programs can access storage objects corresponding to entity instances that were created via the ad-hoc query interface.
One could also layer an application program on top of the access methods or operator methods layers without
necessarily using the front-end portion of the system. Thus, one shared database can be used by both types of
applications with little or no loss of efficiency and minimal loss of data independence. For certain applications, the
availability of such a direct interface is critical to obtain reasonable performance [Rube87]. The flexibility of the
EXODUS approach to extensible database systems will enable users to customize the system to fit such needs.

3.  PROJECT  STATUS 

At the current time, first versions of the EXODUS Storage Manager and the E compiler have been completed and
integrated with each other. Currently, we are working on a second version of the storage manager that will be able
to exploit the capabilities of shared-memory multiprocessors. A number of industrial firms have contacted
Wisconsin regarding the acquisition of the storage manager. The EXODUS optimizer generator is a completed piece of
software that has been provided to both Bell Labs and Hewlett-Packard Labs for experimentation. An evaluation of
the optimizer generator indicates that a rule-based query optimizer produces query plans that are competitive with
those produced by commercial query optimizers.

While an operational E compiler now exists, the compiler does not yet optimize the movement of storage
objects between disk and main memory. Other important aspects of the compiler that remain to be implemented
(which are not, however, viewed as being technically difficult) are support for full inheritance, completing the file
abstraction provided by the language, and destructors for persistent objects. If E is to be "competitive", the
performance of access and operator methods written in E must be comparable with the same algorithms coded in C by
"experts" directly against the EXODUS Storage Manager. We are currently studying alternative techniques (such as
classical register allocation techniques and loop parallelization strategies) for minimizing loads and stores of
objects.

As a demonstration of the power of the EXODUS toolkit, during April of this year we produced a demonstration
relational database management system using the entire toolkit. The result of this effort was demonstrated at the
1988 SIGMOD conference. Access methods and operator methods were written using E. Relational queries, after
being parsed, were optimized using an optimizer generated using the EXODUS optimizer generator and then were
translated to E. Finally, queries were compiled and dynamically loaded into the database server for execution. The
demonstration was very well received by the attendees of the conference. While this demonstration vehicle served
to illustrate the power of the EXODUS toolkit (we put it together in about 4 weeks), it pointed out the need for
several additional tools. We will be working on these tools in the coming year as part of putting the
EXTRA/EXCESS system together.

We intend to distribute the components of the EXODUS toolkit as they become available. The EXODUS
Optimizer Generator has already been distributed to HP Labs and AT&T Bell Labs. We plan to release the second
version of the Storage Manager by September 1988 and the first version of the E compiler by December 1988.


Summary  of  Accomplishments 
Accomplishments  before  FY88: 

-  Completed  overall  design  of  the  EXODUS  database  system. 

-  Completed  design  for  the  EXODUS  Storage  Manager  and  began  its  implementation. 

-  Completed  the  design  of  the  E  programming  language  and  began  its  implementation. 

-  Completed  the  implementation  and  performed  an  evaluation  of  the  EXODUS  optimizer  generator. 

- Completed design and implementation of the EXODUS Dependency Manager.

Accomplishments  for  FY88: 

-  Completed  a  first  version  of  the  EXODUS  Storage  Manager. 

-  Completed  a  detailed  design  of  recovery  algorithms  for  the  EXODUS  storage  manager. 

- Completed a first version of the E compiler, which includes generators, iterators, and persistence.

-  Designed  the  EXTRA  data  model  and  EXCESS  query  language. 

- Produced a demonstration relational database management system using the entire EXODUS toolkit.

Objectives for FY89:

- Begin distribution of the EXODUS Storage Manager.

- Complete the design of the EXTRA/EXCESS object-oriented database system and begin its implementation.

- Investigate query optimization in the context of object-oriented database systems, in particular within
the context of EXTRA and EXCESS.

- Investigate performance improvements for complex objects through replication, in particular within the
context of EXTRA and EXCESS.

- Design and evaluate alternative optimization strategies for E and incorporate the best strategy in the E
compiler. Also provide full support for inheritance and virtual functions. Implement the
"program-fragment" server for E program support.


4.  KEY  TECHNICAL  REPORTS 


The Architecture of the EXODUS Extensible DBMS. M. Carey, D. DeWitt, D. Frank, G. Graefe, J. E. Richardson, E.
J. Shekita and M. Muralikrishna, Proceedings of the International Workshop on Object Oriented Database Systems,
Asilomar, CA, September 1986.

Object  and  File  Management  in  the  EXODUS  Extensible  Database  System.  M.  Carey,  D.  DeWitt,  J.  Richardson, 
and  E.  Shekita,  Proceedings  of  the  1986  VLDB  Conference,  Japan,  August  1986. 

Programming Constructs for Database System Implementation in EXODUS. J. Richardson and M. Carey,
Proceedings of the 1987 SIGMOD Conference, San Francisco, CA, May 1987.

The EXODUS Optimizer Generator. G. Graefe and D. DeWitt, Proceedings of the 1987 SIGMOD Conference, San
Francisco, CA, May 1987.

A Data Model and Query Language for EXODUS. M. Carey, D. DeWitt, and S. Vandenberg, Proceedings of the
1988 SIGMOD Conference, Chicago, Ill., June 1988.

Persistence in EXODUS. Richardson, J., Carey, M., DeWitt, D., and Shekita, E., Proceedings of the Appin
Workshop on Persistent Object Systems, Appin, Scotland, August 1987.

Implementing Persistence in E. Richardson, J., and Carey, M., submitted to the Newcastle Workshop on Persistent
Object Systems, September 1988.

Persistence in the E Language: Issues and Implementation. Richardson, J., and Carey, M., submitted to Software
Practice and Experience, September 1988.

Storage Management for Objects in EXODUS. Carey, M., DeWitt, D., Richardson, J., and Shekita, E., in
Object-Oriented Concepts, Applications, and Databases, W. Kim and F. Lochovsky, eds., Addison-Wesley Publishing
Co., 1988, to appear.

1. Introduction

This report summarizes the activities supported by DOD contract N00014-85-K0788 and the results
of those activities.

Our continuing goal was to characterize the algorithms for which very large multicomputers are
appropriate. Multicomputers are one promising direction for the construction of large-scale computing
engines. Other directions include multiprocessors (which share memory) and uniprocessors like the Cray
that employ pipelining and parallel functional units. The advantages of multicomputers are several.

• It is easy to increase the power of the engine by adding more constituent computers (which we call
machines).

• Algorithms do not have to worry about memory conflicts between machines.

• No hardware-development research is required to prove the fundamental concepts.

Our research has been carried out largely on the Crystal multicomputer, which was funded by NSF
(grant MCS-8105904) to purchase approximately 40 node machines, each a VAX-11/750. These nodes
were originally interconnected by a 10 Megabit/sec ProNet token ring manufactured by Proteon
Corporation. During the period of this contract, the network was upgraded to 80 Megabit/sec. The purpose of this
hardware is to promote research in distributed algorithms for a wide variety of applications. In order to
provide different applications simultaneous access to the network hardware, Crystal provides a software
package, the nugget, that resides on each node. The nugget enforces allocation of the network among
different applications by virtualizing communications within partitions of the network. These partitions are
established interactively through a host machine. Interaction between the user and individual machines is
provided by the nugget facility of virtual terminals. Initial loading, control, and debugging of programs on
node machines is controlled by nugget software.

Crystal is being used for a wide range of applications. Research is underway in distributed operating
systems, programming languages for distributed systems, tools for monitoring, measuring, and debugging
distributed systems, multiprocessor database machines, parallel algorithms for mathematical programming,
numerical analysis and computer vision, and evaluating alternative protocols for high performance local
network communications. Some of those projects are discussed in detail here.

Our most recent work has concentrated on experiments involving Charlotte, development of user
interfaces, tools for distributed backtracking and parallel debugging, distributed resource allocation,
algorithms for the implementation of distributed languages, and an algorithm for parallel solution of a network
flow problem.

This  broad  attack  has  been  broadly  successful.  Significant  advances  have  been  made  in  most  of 
these  areas.  In  the  following  sections,  we  summarize  the  most  recent  results  and  refer  the  reader  to  papers 
in  the  appendix  for  details. 

2.  Charlotte 

In the early part of the contract period, the Charlotte operating system stabilized and further
development of operating system software ceased. The design of Charlotte is presented and discussed in a section
of a paper describing Crystal, which has appeared in the IEEE Transactions on Computers [24]. We have
conducted numerous experiments using Charlotte to implement parallel solutions to a variety of computationally
expensive problems. The results of these experiments are described in three technical reports,
which are included as appendices [12, 14, 16]. A more detailed description of interprocess communication
facilities in Charlotte has appeared as a technical report [13] and in IEEE Software [22]. We have also
presented a paper reflecting on lessons learned from the design of Charlotte [15, 26].

3.  User  Interfaces 

The Charlotte command interpreter ("shell") is rather primitive. A project to create a better user
interface blossomed into an in-depth research project in the area of user interfaces. The goal of this
research shifted from the creation of a particular user interface to tools for creating user interfaces in
general. The product of this investigation, called "Dost", assumes the programmer has defined his application
as a collection of communicating processes. Each application has a display manager that translates
between the language of user interaction (editing display objects on the screen) and the language of process
interaction (exchanging messages). Thus the user appears to application processes to be another process
that can send and receive messages, while the application processes (and the data objects they manage)
appear to the user to be objects that can be directly manipulated. The generation of the display manager is
automated by instrumenting the compiler to produce tables describing the data types defined by the
application. Thus an application can simply display some of its variables and allow the user to edit them,
without worrying about translation to and from a textual representation.

Unfortunately, a thorough exploration of these ideas requires a hardware and software environment
(a bitmap display, mouse, and window system) that was not available with Charlotte at the time this
research was being done. Therefore, Dost was prototyped in the Xerox Development Environment on
Dandelion workstations.

This research is described in detail in a Ph.D. thesis [3], and is summarized in two conference papers
[19, 28] and a paper that has been submitted for publication in the ACM Transactions on Programming
Languages and Systems [31].

4.  Communications  Interface 

Our measurements of Charlotte indicated that the principal bottleneck limiting performance was the
processing overhead associated with transmitting a message. A study of ways to mitigate this problem
culminated in the design of an alternative architecture for individual Charlotte nodes. The study had four
phases. The first phase carefully measured a variety of message-based operating systems, including
Charlotte. In the second phase, we used these measurements to design a hardware and software architecture
tailored to minimize the computational latency. The third phase produced a working prototype of the
software design on a shared-memory multiprocessor. This prototype demonstrated the feasibility of the
design, and allowed us to make detailed measurements of the load it produced on the hardware. The fourth
phase consisted of the construction of mathematical models of various hardware architectures, using the
formalism of generalized timed Petri nets. These models were solved using a software package produced
by other researchers at the University of Wisconsin. The results nicely matched the experimental
measurements derived in phase three, and indicated that the proposed hardware architecture could achieve
significant improvements.

This research is reported in detail in a Ph.D. dissertation [4]. Various aspects of it were reported in
conferences [25, 27], and have been submitted to journals [32].

5.  Distributed  Backtracking 

One of the principal investigators (Finkel) in cooperation with another faculty member (U. Manber)
developed a special-purpose tool for parallelizing applications that can be presented as backtracking
applications. This tool, called DIB (for Distributed Implementation of Backtracking), was implemented directly
on top of the Crystal nugget (that is, not using Charlotte). Results of experiments using this tool are
presented in the reports cited earlier. In addition, descriptions of DIB and the more interesting
implementation problems have appeared as a technical report [11], a conference paper [18], and a journal paper [23].
DIB was subsequently ported to a shared-memory multiprocessor. It also inspired another tool, called PIB
(for Parallel Implementation of Backtracking), specifically designed for a multiprocessor. Research
involving both of these tools continues.
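
The general idea of parallelizing a backtracking application can be sketched as follows. This is not DIB's actual protocol, which this report does not describe. It is a minimal single-process illustration in Python, assuming that the top of the search tree is split into independent subproblems that are dealt out to a fixed number of simulated machines. The N-queens problem, the function names, and the split_depth parameter are all hypothetical choices made for the example.

```python
from collections import deque
from typing import Deque, List, Tuple

Placement = Tuple[int, ...]  # column of the queen in each row filled so far


def safe(partial: Placement, col: int) -> bool:
    """True if a queen in the next row at `col` attacks no earlier queen."""
    row = len(partial)
    return all(c != col and abs(c - col) != row - r for r, c in enumerate(partial))


def expand(partial: Placement, n: int) -> List[Placement]:
    """Children of a search-tree node: all legal one-row extensions."""
    return [partial + (col,) for col in range(n) if safe(partial, col)]


def count_solutions(partial: Placement, n: int) -> int:
    """Ordinary sequential backtracking below one subproblem."""
    if len(partial) == n:
        return 1
    return sum(count_solutions(child, n) for child in expand(partial, n))


def distributed_count(n: int, machines: int, split_depth: int = 2) -> int:
    """Split the top of the tree into subproblems and deal them to machines."""
    frontier: List[Placement] = [()]
    for _ in range(min(split_depth, n)):
        frontier = [child for node in frontier for child in expand(node, n)]

    # One work queue per simulated machine, filled round robin.
    queues: List[Deque[Placement]] = [deque() for _ in range(machines)]
    for k, sub in enumerate(frontier):
        queues[k % machines].append(sub)

    # Each machine would run this loop independently; here they run in turn.
    total = 0
    for q in queues:
        while q:
            total += count_solutions(q.popleft(), n)
    return total


eight_queens = distributed_count(8, machines=5)
```

Because the subproblems share no state, the per-machine loops could run on separate nodes with only the final counts combined. A real tool would also need to rebalance work when some subproblems finish early, which this sketch omits.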

6.  Parallel  Debugging 

A research project concerning tools for debugging in a distributed environment that was begun
before the commencement of this contract was completed and led to a Ph.D. thesis [2]. Summaries of the
major results were presented at a workshop [20], and in a published paper [30].

7.  Distributed  Allocation 

Research  in  distributed  resource  allocation,  begun  under  a  predecessor  of  this  contract,  led  to  a  Ph.D. 
thesis.  A  journal  article  presenting  a  particularly  interesting  algorithm  that  arose  from  this  research  has 
been published [21].

8. Miscellaneous Other Results

An idea that arose in the implementation of the Lynx language (which was developed under the
predecessor to this contract) has been published [29]. Dr. Finkel, working in cooperation with other
researchers, has developed a parallel algorithm for the solution of a problem in network flows [17].